ABSTRACT

In this paper we explore several problems in the use of natural
language interaction as a measure of language proficiency. The approach
developed here is based on observation of natural language interaction,
with a goal of distinguishing between the effects of change in discourse
contexts and change in the language abilities of individuals over time.
We use maximum likelihood techniques to estimate the effects of discourse
contexts on length of utterance. We then calculate the probability that
utterances will be as long as those observed in each discourse context.
This probability becomes the basis for constructing a weighted index of
utterance length. Our approach is tested on language samples from
Spanish/English bilingual children and compared to other indicators of
language ability.



DISCOURSE-SENSITIVE MEASUREMENT OF LANGUAGE DEVELOPMENT IN BILINGUAL
CHILDREN*

Robert Berdan and Maryellen Garcia

Introduction

In this paper we look at the impact of a variety of discourse
characteristics on children's use of Spanish and English, as measured by
length of utterance. Discourse contexts intervene strongly in the
relationship between length of utterance and language development.
Procedures are introduced here for distinguishing between the effects of
discourse context and changes in the language abilities of individuals
over time. This allows the possibility of more accurately measuring
growth in language proficiency by observation of natural language
interaction.

The present work is part of on-going longitudinal studies of
language development in bilingual contexts (Garcia, Veyna-Lopez,
Siguenza & Torres, 1982). These studies include primarily naturalistic
observations of children in Spanish-English and Korean-English
environments on a monthly schedule. Over the course of the study, the
children will range in age from approximately four years to ten years.
The children are being observed in their use of English, as well as in
the language of the home, Spanish or Korean. The study is being
undertaken to document the nature of the language development process
for children in bilingual contexts, with particular interest in
relating that process to educational practice.

Desired Characteristics of a Measure of Language Development

These characteristics of the longitudinal study impose a series of
constraints or desired properties for an acceptable indicator of
language development. The need, however, is not unique to this
longitudinal study. Both in research and in practical applications, the
need for improved measures of language development continues. We
identify the following five characteristics as particularly desirable:

1. It should be fine-grained, able to detect change over fairly
   small units of time, i.e., months or aggregates of several
   months.

2. It should be language-independent, or capable of comparable
   forms across languages of fundamentally differing linguistic
   structure.

3. It should be continuous across the age range 4-10, and
   desirably be continuous from infancy to adulthood.

4. It should be capable of measuring language in use, directly or
   indirectly, rather than the abstract notion of "knowledge of
   language."

5. It should be sensitive to a school-based notion of language as
   proficient communication, rather than narrowly based on
   language as linguistic structure.

This is a highly idealized list of properties. No such measure now
exists; it may not be obtainable. Nonetheless, the analytic approach
taken below seems to offer at least some promise in each of these areas.

*A number of people have been extremely helpful in the development
of this paper. Consuelo Siguenza did the discourse coding on which the
analysis is based. Dr. Pascale Rousseau reviewed an earlier draft, and
provided many helpful comments, particularly on the use and
interpretation of the maximum likelihood estimates. Dr. Alvin So
performed the many passes of computer data analysis.

Measures Derived from Natural Discourse

Measures of language development can be grouped in a number of
different ways. Among these we find it useful to distinguish between
those which employ primary characteristics of language, and those which
employ secondary, or derived, characteristics. By primary
characteristics we refer to measures which derive from linguistically
well-defined elements and relationships. Indicators of the use of a
great variety of language characteristics fall into this category,
including such things as inflectional morphemes (e.g., plural markers,
verb agreement markers), relative clauses, and various other syntactic
markers or constructions.

All of these elements of language do develop over time. Charting
change in their frequency and distribution through time is an important
characterization of the language development process. Many of these
indicators, however, tend to stabilize early in the acquisition process.
Research has shown that for monolinguals they tend to be largely
acquired by the onset of schooling, or shortly thereafter (cf.
Berko-Gleason, 1971; Menyuk, 1971). Those indicators which are most
important for distinguishing specific stages of language development in
children over a wide age range seem to occur quite infrequently in
natural discourse, thus requiring specifically structured elicitation
procedures to occasion reliable frequencies of observation. Such
structuring is not possible in a study that focuses primarily on the
child's natural interaction with a variety of interlocutors.

Secondary indicators, on the other hand, measure language
development in global, rather than particularistic, terms. Some such
indicators that have been suggested for educational applications are
based on T-units (Hunt, 1965) or communication units (Loban, 1976), and
overall measures of utterance length (Brown, 1973; Cazden, 1968). These
secondary measures can be distinguished from primary indicators in the
following sense: Language learners can be said to be acquiring the use
of plural markers or agreement markers, or any other primary language
characteristics, with increasing probability across time, and this is
readily explained in terms of theoretically well-motivated linguistic
processes.

On the other hand, to say that children are acquiring the use of
more words per sentence is not well defined in any generally supported
theory of language or cognitive development. This is not to say that
children do not use longer sentences as their language ability develops.
They do; but children also engage in increasingly complex topics of
discourse, with increased demands for information transfer. Children's
control of primary language characteristics, such as the processes of
syntactic embedding, also increases. Thus, children have at their
disposal an increasing array of linguistic devices for expressing
increasingly complex messages. These primary processes underlie the
derived or secondary relationships between number of words and the
syntactic units containing them. Analysis of the primary linguistic and
cognitive characteristics, however, is extremely complex and costly; in
some cases there is not even a well agreed upon basis for the
classification of observations. The secondary characteristics,
expressed as a ratio of lexical units to syntactic units, confound
language and cognitive and social development in a way that may
frustrate academic researchers, but which seems to characterize fairly
reasonably the somewhat confounded notion of language proficiency which
educators find most appealing. As outlined below, we have used in our
analyses the secondary measure of words per sentence, or utterance
length, as an indicator of development.

Mean length of utterance: Measuring discourse effects. Mean
length of utterance (MLU) has been widely used to report early stages of
acquisition both for English-speaking (Brown, 1973; Bloom, 1970; Cazden,
1968) and Spanish-speaking children (Brisk, 1972; Peronard, 1977;
Padilla & Liebman, 1982). Despite the fact that large studies show that
MLU increases monotonically with age through the school years (at least
for samples of writing; see, for example, Hunt, 1965; O'Donnell,
Griffin, & Norris, 1967), MLU in any of its variously calculated forms
has not shown much usefulness past the two or three word stage of
development. A variety of problems related to clinical use of MLU as an
indicator of development are treated in a collection of articles
reprinted in Longhurst (1974). Many of these problems seem to derive
from the many ways in which discourse structure influences utterance
length, and the confounding that is introduced by sampling fluctuation
across different discourse contexts.

1. Ellipsis. One of the most obvious of these is the phenomenon
of ellipsis. The rules for ellipsis in English discourse are rather
complex (cf. Halliday & Hasan, 1976). Ellipsis may be characterized
generally, however, as the nonrepetition of identical information across
adjacent turns in a conversation. Consider the following exchange:

Interviewer: What do you think they're going to do after they
             finish eating?

Child:       Play. (full ellipsis)



This response contains just one word. Yet it is fully as appropriate
for the discourse as would any of the following forms have been:

I think they're going to play after they finish eating.
(no ellipsis)

I think they're going to play. (partial ellipsis)

After they finish, they're going to play. (partial ellipsis)

The more extended responses might have offered positive evidence of
fairly well developed English fluency; the use of the most elliptic
form, however, offers no negative evidence. In some instances the
failure to use ellipsis where it is possible makes the conversation
sound unnatural.

The discourse appropriate ellipsis of the exchange above is quite
different from the child's turns below, which are the same length and
are superficially similar, but do not show syntactically well-formed
ellipsis:

Interviewer: What do you do when you play with your friends?

Child:       Game.

Interviewer: What kind of games do you play?

Child:       Toy.

Interviewer: Okay, do you have a favorite toy? . . . Tell me
             about your favorite toy, okay?

Child:       Space.

Interviewer: It's a space toy? What do you do with it?

Child:       Playing.

These turns by the child are "contingent" on the prior turn, in the
sense developed by Bloom and colleagues (e.g., Bloom, Rocissano & Hood,
1976). They share the same topic as the previous turn, and add new
information to it. They are not elliptic in the conventional sense, but
are rather what Braine (1974:456) termed "pseudo-elliptical." They
cannot be said to result from a discourse reduction process, but are
instead essentially holophrastic.

In summary, elliptic utterances may be considerably shorter than
their nonelliptic paraphrases, but offer no evidence for less well
developed language proficiency. Conversely, some very short elliptic
utterances indicate ability to comprehend and use dialogue
appropriately, thus showing greater language proficiency than some
equally short, but non-elliptic utterances. Clearly, this phenomenon
confounds the use of utterance length as a measure of language
development. Any measure based on length must either control the
relative frequency of ellipsis in the sample of language analyzed, or it
must provide a means of weighting utterances of the same length
differently in order to compensate for ellipsis effects.
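
The confounding described above can be made concrete with a minimal sketch that computes mean length of utterance separately for each ellipsis category. The utterances, the category labels, and the helper names below are invented for illustration; word counting by whitespace is a simplification and is not the BINL scoring procedure.

```python
from collections import defaultdict

# Toy utterances (invented for illustration, not data from the study).
# Each record is (utterance text, ellipsis code).
utterances = [
    ("Play", "ellipsis"),
    ("About water", "ellipsis"),
    ("I think they're going to play", "no_ellipsis"),
    ("Water becomes a liquid", "no_ellipsis"),
]


def length_in_words(text):
    """Count whitespace-separated words (contractions count as one word here)."""
    return len(text.split())


def mlu_by_context(records):
    """Return the mean length of utterance for each context code."""
    totals = defaultdict(lambda: [0, 0])  # code -> [word total, utterance count]
    for text, code in records:
        totals[code][0] += length_in_words(text)
        totals[code][1] += 1
    return {code: words / n for code, (words, n) in totals.items()}


mlu = mlu_by_context(utterances)
```

A sample with more elliptic turns would show a lower overall MLU even if the speaker's ability were unchanged, which is why the two categories are kept apart here.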

2. Discourse function. Other aspects of discourse structure and
function have similar confounding effects. The function of a particular
turn, or the function of the immediately prior turn in the discourse,
both influence what information is exchanged, and thus also influence
the length and complexity of the utterances. This is apparent in the
following exchange:

Interviewer: Do you like science?

Child:       Yes. I like it.

Interviewer: What else do you learn in your science class?

Child:       About water.

Interviewer: What about water?

Child:       Water becomes a liquid and the gas, no the air,
             becomes water.

The interviewer's first question is a request for a specific piece of
information. It is a yes/no question; it can be answered appropriately
with a single word, or, as in this case, with a pronominalized
rephrasing of the question. The interviewer's second question is
another request for information, referring to a prior topic in the
conversation. It is readily answered with a simple noun phrase. The
third question is a request for elaboration on the child's immediately
preceding turn. The child's response could vary widely in the amount of
elaboration provided, but almost all of the alternatives would require a
full sentence as a response. The request for elaboration is also an
invitation for the child to provide an extended response on a topic
introduced by the child. Utterances in such a discourse context tend to
be rather longer than responses to requests for specific information.

Attempts to resolve discourse effects. Some researchers,
frustrated by the instability of MLU as a measure of language
development, have attempted to control the discourse context in which
language is sampled. This is frequently done by supplying a context,
often pictures, and by specifying a task, usually description or story
telling. What the adult participant may or may not do or say is also
controlled. These are discourse variables of a higher order than what
we have considered here. Observed differences across these macro
discourse variables may well turn out to be due to differing rates of
occurrence of the more micro discourse variables that are considered
here. In any event, such things as ellipsis potential and function of
utterance may be difficult and in some cases impossible to control in
any naturalistic language use or elicitation situation.

Considered jointly, these three discourse characteristics
(ellipsis, function of previous utterance, function of utterance) seem to
differentiate utterance length to a greater extent than does
chronological age or length of exposure to language in the sample of
children in the longitudinal studies. The analyses presented here
provide a way of incorporating this and other discourse-related
information into the measurement of utterance length. This reduces the
sampling problem and tends to stabilize any length measure based on
naturalistic conversation.
Methodology

The findings reported here result from continued analysis of the
data generated by NCBR's Longitudinal Studies of Language Development in
Bilingual Contexts. Some aspects of the particular data sample used
here were reported in Berdan & Garcia (1982).

Participants. The sixteen children in this data set ranged in age
from 3;8 to 9;8 at the time the language samples were collected in the
summer of 1981. The children are eight sibling pairs from throughout
the greater Los Angeles area. The home language for these children
ranges from almost exclusively Spanish to largely English. The children
are in a variety of regular and special instructional programs in their
schools. Characteristics of the children, their homes and their schools
are detailed in Garcia et al. (1982). For the longitudinal study, these
children are visited by a bilingual fieldworker monthly in their homes
and in their schools. Each session is tape recorded.

Elicitation procedures. The data for this analysis were elicited
using the picture description and story telling task of the Basic
Inventory of Natural Language (BINL) (Herbert, 1979). Following the
general procedures for the administration of the BINL, the children were
asked to describe various pictures which they selected from a set of
culturally diverse color pictures. The sessions were conducted in the
children's homes; both focal siblings were present throughout, and were
encouraged to interact during the course of the session. In some cases
there were also other family members or neighborhood children present.
The sessions were tape recorded and subsequently transcribed. The
fieldworker directed that for each child the task was to be done first
in English and then in Spanish. The goal of the session was to elicit
at least 50 utterances in English and in Spanish from each child. In
some cases this was not possible, particularly in the weaker language of
the younger children. Three of the children did not produce even ten
utterances in English, and were excluded from the analysis of the
English data.
Coding. Utterances were extracted from the transcription of each
language in accordance with BINL procedures specified in Herbert (1979).
The lists of extracted utterances were then sent to the BINL publisher
for machine scoring. In that process the utterances were edited to
exclude sentence partials that do not conform to the BINL definition of
utterance.¹ Words borrowed from the non-test language were also
excluded from the count. Utterances were then scored for length in
words and a "complexity index" was calculated. Results of the BINL
scoring are reported in Berdan & Garcia (1982).

In addition to being coded for length and language (English or
Spanish), each utterance was subsequently coded for several discourse
related characteristics that seemed likely to relate to length of
utterance. These variables represent a variety of attributes of
language or communication. They include characteristics that will
generally be considered discourse characteristics, such as the status
of the speaker and the function of turns in the conduct of the
discourse. They also include more directly syntactic measures, such as
the number of clauses in an utterance, and a measure that is syntactic
but dependent on the syntax of the previous turn in the discourse,
ellipsis. This set of variables and the values by which utterances were
categorized are listed below. They will be referred to here
collectively, and somewhat loosely, as discourse variables. They
included the following:



¹The BINL is scored in terms of mean number of words per utterance in
the sample and the complexity of the language used. Eliminated from the
word count are things such as repetitions, corrections, fillers, and
words substituted from another language. Borrowed words, i.e.,
vocabulary incorporated from another language, are counted, as are
proper names. Contractions are counted as two words.
1. Ellipsis. This variable is coded for whether the child's
   utterance is an elliptic form of the previous relevant
   utterance:

      [...] (but ellipsis possible)
   d. [...] not appropriate (by the rules of discourse)



2. Discourse function of the turn.

3. Discourse function of the previous turn. The two variables
   coding discourse function used the same set of values, based
   on informational content and the effect of the utterance in
   the interaction. It is comparable to other function
   classification systems (e.g., Sinclair and Coulthard,
   1975; Dore, 1979:354-355; Peters, Ostman, Larsen, &
   O'Connor, 1982).



   a. Agreement or disagreement response. An utterance
      which contradicted or agreed with what was said in the
      last relevant utterance.

   b. Request for specific information. A question which
      required information or conjecture from the next
      speaker.

   c. Elaboration. An utterance which advanced the
      narrative or added new information to a previous point
      in the discourse.

   d. Information response. An utterance which provided the
      specific factual or conjectural information requested
      by the previous speaker.

   e. Solicitation. An utterance which was procedural in
      nature, selecting the next speaker or otherwise
      indicating that the next turn be taken.

   f. Request for clarification. An utterance which
      required the speaker of the previous relevant turn to
      clarify or repeat what was said in that turn.

   g. Attention getter. An utterance which was
      either a bid for the floor, joke, or other means of
      getting attention in the interaction.

   h. Evaluative remark. An utterance which gave positive
      or negative reinforcement to the last relevant
      utterance.

   i. Clarification. An utterance clarifying what the
      speaker or another speaker had said previously.

   j. Request for elaboration. An utterance in which the
      speaker asked or otherwise indicated that another
      speaker give new information about the topic under
      discussion.

4. Syntax of the utterance.

5. Syntax of the previous turn. Syntax was grossly indicated as
   the number of clauses.

   a. Less than one full clause.

   b. One full clause.

   c. Two clauses.

   d. Three clauses or more.

6. Speaker of the previous turn. This variable classified the
   speaker who had the last relevant utterance in the interaction
   to which the child's utterance was a response.

   a. Sibling of the child.

   b. Peer of the child.

   c. Usual fieldworker.

   d. Companion fieldworker.

   e. Adult relative.

   f. No relevant previous turn.



The one-way tabulations on each of these variables show that the
distribution of utterances across values is highly skewed (Table 1). For
example, for Function of Turn, there were 584 examples of Elaboration,
but only six examples of Request for Elaboration across all sixteen
children. Cross tabulation of all variables produced numerous empty
cells. Because of this, the number of categories was reduced for each
variable by conflation of logically similar categories, as also shown in
Table 1. This recoding formed the basis for the analyses done here.²
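
The tabulation and conflation step can be sketched as follows. The mapping below is a hypothetical grouping chosen only to illustrate the mechanics; the recoding actually used is the one given in Table 1, and the counts here are invented.

```python
from collections import Counter

# Hypothetical mapping from original function codes to conflated categories.
# This grouping is an assumption for illustration, not the recoding of Table 1.
CONFLATE = {
    "request_information": "Request Information",
    "request_clarification": "Request Information",
    "elaboration": "Elaboration",
    "information_response": "Information Response",
    "attention_getter": "Attention to Interaction",
    "evaluative_remark": "Attention to Interaction",
}


def tabulate(codes):
    """One-way tabulation: number of utterances carrying each code."""
    return Counter(codes)


def conflate(codes, mapping):
    """Replace each original code by its conflated category."""
    return [mapping[code] for code in codes]


# Invented, deliberately skewed sample of function codes.
codes = (
    ["elaboration"] * 5
    + ["request_clarification"]
    + ["attention_getter"] * 2
    + ["evaluative_remark"]
)

raw = tabulate(codes)
merged = tabulate(conflate(codes, CONFLATE))
```

Conflation leaves the total number of utterances unchanged while reducing the number of sparse or empty cells in any later cross tabulation.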

Excerpts of approximately ten pages were then selected from the
transcript of each session in each language, which contained an average
of 45 turns per focal child. These were submitted to five bilingual
"judges" who were asked to assess the overall English language and
Spanish proficiency of the child on a scale of one to ten.

Results of Analysis of Discourse Variables

The questions underlying this study, and the nature of the data
set, require several different analytic approaches. We want in the
first place to determine which of the discourse variables relate to
differences in utterance length. We then want also to separate out
effects on length due to discourse context from the more general
utterance length characterization of each child in both Spanish and
English.

Natural language data sets tend to be plagued by a number of
distributional characteristics which call in question the
appropriateness of some conventional statistical procedures, such as
analysis of variance. In the present case the chief problems are the
numerous empty cells and the grossly unequal number of observations per
cell. The variables which seem to be most meaningful in differentiating
the effects of discourse context cross classify in such a way that there
is extremely low probability of occurrence of utterances in some cells,
while in other cells, utterances occur at high frequency.



²It may be noted in Table 1 that in recoding, the values
Elaboration and Information Response were not conflated for Function of
Turn but were for Function of Previous Turn. Elaboration and
Information Response are themselves quite different with respect to
length, but do not seem to differentially affect the subsequent turn.
We use the following strategy to attempt to circumvent these
problems. First, each of the six linguistic variables is treated
separately, one at a time, by analysis of variance. The absence of a
significant effect for any of the variables suggests that it may be
ignored in consideration of the overall relationship of discourse
context to length. Because the variables are analyzed separately,
however, there is the possibility that any one of them may simply be a
recoding of some other variable; e.g., an effect related to Request for
Information as a Function of the Previous Turn may be indistinguishable
from an effect related to Information Response as a function of the
turn itself. Thus these analyses of variance allow the discard of
irrelevant variables, but do not identify redundant variables. To do
that it is necessary to look at the remaining discourse variables
simultaneously. We use two separate procedures to do that: multiple
regression, and a multinomial maximum likelihood estimate. For each of
these (but not for the analyses of variance) data are aggregated across
children. The multiple regression, of course, reduces essentially to
analysis of variance, but is the more conventional form for the
subsequent calculation of residuals of length in each discourse
context, a calculation which will subsequently be used as a weighting
device.
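
The idea of residuals of length within a discourse context can be illustrated with a minimal sketch. It assumes the simplest possible case, a single nominal discourse variable, where the regression prediction for a context is just the mean length observed in that context. The records and the function names are invented, and averaging residuals per child is one possible use of them, not necessarily the weighting the analysis goes on to apply.

```python
from collections import defaultdict
from statistics import mean

# Toy records (invented): (child id, discourse context, length in words).
records = [
    ("A", "ellipsis", 1), ("A", "no_ellipsis", 4),
    ("B", "ellipsis", 3), ("B", "no_ellipsis", 8),
]


def context_means(rows):
    """Predicted length for each context: the mean length observed there."""
    by_context = defaultdict(list)
    for _child, context, length in rows:
        by_context[context].append(length)
    return {context: mean(lengths) for context, lengths in by_context.items()}


def mean_residual_by_child(rows):
    """Average of (observed - predicted) length over each child's utterances."""
    expected = context_means(rows)
    by_child = defaultdict(list)
    for child, context, length in rows:
        by_child[child].append(length - expected[context])
    return {child: mean(residuals) for child, residuals in by_child.items()}


residuals = mean_residual_by_child(records)
```

A positive value means the child's utterances run longer than is typical for the contexts in which they occurred, regardless of how those contexts happened to be sampled.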

The maximum likelihood estimates provide parallel information, but
with somewhat different assumptions, as detailed below. Rather than
just predicted mean lengths for each discourse context, the multinomial
procedure yields expected frequency distributions of all lengths for
each discourse context. This difference is a shift from estimating
length to estimating frequencies of occurrence of each length. This
overall approach of successive analyses is similar in many respects to
that used earlier for analysis of primary linguistic variables in Berdan
(1975) and in Garcia (1981).
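
The shift from estimating a mean length to estimating a frequency distribution of lengths can be sketched as follows. The sketch assumes the unconstrained case, in which the maximum likelihood estimate of each multinomial probability is simply the relative frequency of that length within a context; the model actually fitted may constrain these probabilities through the discourse variables. All data and names are invented.

```python
from collections import Counter

# Toy data (invented): discourse context -> observed utterance lengths.
observed = {
    "ellipsis": [1, 1, 2, 2, 2, 3],
    "elaboration": [4, 5, 5, 6, 8],
}


def length_distribution(lengths):
    """Unconstrained ML estimate: relative frequency of each length."""
    counts = Counter(lengths)
    n = len(lengths)
    return {length: count / n for length, count in counts.items()}


def prob_at_least(distribution, length):
    """Estimated probability that an utterance is at least `length` words long."""
    return sum(p for k, p in distribution.items() if k >= length)


dists = {context: length_distribution(v) for context, v in observed.items()}

# The same three-word utterance is unusual after an elliptic context
# but unremarkable as an elaboration.
p_ellipsis = prob_at_least(dists["ellipsis"], 3)
p_elaboration = prob_at_least(dists["elaboration"], 3)
```

Probabilities of this kind are what allow utterances of the same length to be weighted differently according to the context in which they occur.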

Analyses of Variance. For the analyses of variance, children were
grouped according to global grade level: Preschoolers, First Graders
and Third Graders. Utterances for each child were grouped separately by
language, English or Spanish. Then each of the remaining six discourse
variables was treated in turn in a grade x language x discourse variable
design. For each analysis, utterances were reclassified by the levels
of the discourse variable in the analysis. Mean MLUs across grade
levels are shown for each discourse level of each of the variables in
Table 2. The significant effects (p < .05) for each of the six analyses
are summarized in Table 3.

                              Table 2

        Mean Length of Utterance in English and Spanish
                     by Discourse Variable

                                      English    Spanish

Ellipsis
   Ellipsis                            3.?0       3.85
   No Ellipsis                         4.36       5.28

Function of Turn
   Request Information                 ?          ?
   Elaboration                         5.36       5.86
   Information Response                3.26       3.66
   Attention to Interaction            3.0?       4.??

Number of Clauses
   Less than 1                         2.5?       2.32
   Equal to 1                          5.3?       ?
   More than 1                         3.51       2.4?

Function of Previous Turn
   Request Information                 3.34       3.74
   Prompt                              4.88       5.93
   Elaboration/Information             4.0?       4.0?
   Attention to Interaction            4.13       4.1?

Speaker of the Previous Turn
   Fieldworker                         4.09       5.?7
   Other adult                         3.17       4.39
   Peer                                4.58       3.89

Number of Clauses of Previous Turn
   Equal to 1                          4.49       ?
   More than 1                         4.70       4.24

                              Table 3

   Summary of Significant Effects (p < .05) for Six Analyses
      of Variance (Grade x Language x Discourse Variable)

VARIABLE (levels)              MEAN SQUARE    df       F       p

ANALYSIS 1:
ELLIPSIS
   Ellipsis (2)                  27.363       ?      36.?0   .000?

ANALYSIS 2:
FUNCTION OF TURN
   ?                             26.80?       2,14    5.29   .0225
   Function (4)                  23.283       3,36   10.25   .0000

ANALYSIS 3:
NUMBER OF CLAUSES
   ?                             ?            ?       ?      .0000

ANALYSIS 4:
FUNCTION OF PREVIOUS TURN
   ?                             ?            ?       ?      ?

ANALYSIS 5:
SPEAKER OF PREVIOUS TURN
   ?                              3.010       2,24    4.30   .0254
   Language (2) x Speaker (3)    ?6.788       2,24    5.66   .0097

ANALYSIS 6:
NUMBER OF CLAUSES, PREVIOUS TURN

Five of the six discourse variables show significant effect on the
length of utterance:

   Ellipsis
   Function of turn
   Number of clauses
   Function of the previous turn
   Speaker of the previous turn

The sixth variable, Number of Clauses of the Previous Turn, showed no
significant effect. As shown in Table 3, there is an effect for grade
level, but only in the analysis by Function of Turn. In none of the
analyses is there a significant main effect for language. There are
several interactions, but for none of them are all of the relevant main
effects significant. Inspection of the cell means suggests that there
is some tendency for shorter utterances by preschool children,
particularly in English. In general, the preschool children did not use
multiple clause utterances in English. However, given that the six
analyses are simply reclassifications of the same data set, grade and
language effects, which are not consistent across analyses, are highly
suspect.

These analyses lend strong support to our general contention that
discourse context is an important intervening variable in the
interpretation of the relationship of MLU to language development.
Analyzing the discourse variables separately, however, entails the
possibility that one or more of the observed effects is nothing more
than a re-labelling of some other logically prior effect. Considering
the discourse variables simultaneously in this data set, however,
introduces the distributional problems referred to above. For this
reason we turn first to multiple regression, and then to a maximum
likelihood procedure.

Multiple Regression Analysis. In the multiple regression analysis
we wish to look at the effects of all variables simultaneously. In
order to maintain the size of each cell, we aggregated the observations
across speakers. The two syntactic variables, Number of Clauses and
Number of Clauses in the Previous Turn, were not used in this analysis.
The analysis of variance showed no effect on length for Number of
Clauses of the Previous Turn. Number of Clauses for the child's own
turn did show a significant effect (Analysis 3). However, Number of
Clauses is not, strictly speaking, an independent variable, but an
alternate to number of words as a representation of length of utterance.
It correlates highly with length measured in words (r = .75), and
virtually precludes showing any other significant effects in multiple
regression.
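
The correlation between number of clauses and length in words is an ordinary product-moment correlation. A minimal sketch of that computation follows; the paired values are invented and do not reproduce the r = .75 reported above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented paired observations: clauses per utterance, words per utterance.
clauses = [0, 1, 1, 2, 3]
words = [1, 4, 6, 9, 12]

r = pearson_r(clauses, words)
```

When two measures track each other this closely, entering one as a predictor of the other leaves little variance for any remaining variable to explain.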

The four remaining discourse variables, Ellipsis, Function of
Turn, Function of Previous Turn, and Speaker of Previous Turn, with
Language as a fifth, are the independent variables used for the multiple
regression. For regression, each of these variables is construed as a
nominal scale. The values of these scales were dummy coded according to
the procedures outlined in Cohen & Cohen (1975:175 ff.). In this coding
each value becomes a dichotomous variable; each utterance is scored
according to "presence" or "absence" of each value of each variable. In
this scoring, one value for each variable becomes in one sense
redundant, and in another sense becomes a reference point for comparison
to the other values. This status was given arbitrarily to the
"Attention to Interaction" values for the Function variables, and to the
"Other Adult" value of the Speaker of the Previous Turn. Ellipsis and
Language were already dichotomous variables.
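The dummy coding described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `dummy_code` is hypothetical, and only the level labels are taken from the text.

```python
def dummy_code(value, levels, reference):
    """Return 0/1 indicators for every level except the reference level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels if level != reference]

# Levels of Function of Turn as named in the text; the reference level
# ("Attention to Interaction") is scored 0 on every indicator.
FUNCTION_LEVELS = [
    "Request Information",
    "Elaboration",
    "Information Response",
    "Attention to Interaction",
]

row = dummy_code("Elaboration", FUNCTION_LEVELS, reference="Attention to Interaction")
print(row)  # [0, 1, 0]
```

A variable with four values therefore contributes three dichotomous predictors, and the reference value is recovered as the case in which all three are zero.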

The results of the multiple regression analyses are shown first for
each discourse variable separately, analogous to the analysis of
variance treatment. For these, R² represents the proportion of variance
in length attributable to each variable alone, when the other variables
are not considered. These are shown in Table 4. Values of R² range
from .007 for Language to .150 for Function of Turn. All are
significant beyond α = .01, by conventional F tests. However, the F
ratios should not be interpreted literally, since the degrees of freedom
for their denominators represent all utterances aggregated across all
children.

Table 4

Multiple Regressions for Each Discourse Variable
Treated Separately

INDEPENDENT VARIABLE            BETA    MULTIPLE R     R²

ELLIPSIS                                   .339       .115
  Ellipsis                     -.333

FUNCTION OF TURN                           .387       .150
  Request Information           .05?
  Elaboration                   .305
  Information Response         -.119

SPEAKER OF PREVIOUS TURN                   .303       .092
  Fieldworker                   .??1
  Peer                          .255

FUNCTION OF PREVIOUS TURN                  .356       .127
  Request Information          -.23?
  Prompt                        .165
  Elaboration/Information      -.02?

LANGUAGE                                   .082       .007
  English                       .082

? Digit illegible in the source copy.

The sets of dummy variables representing each discourse variable
were then entered into the regression step-wise, in the order that they
are listed in Table 4. This ordering was not based strictly on logical
precedence among the variables, but on the relative magnitude of effects
in the analyses of variance, and on the ease with which the variables
can be coded. This latter consideration confounds other orderings, but
is of considerable interest if this procedure is to have practical
application. In particular, it is generally easier to identify the
speaker of the previous turn than it is to code the discourse function
of that turn. Other orderings than the one presented here merit
consideration. In Table 5 the values of Beta are given for each dummy
variable, for each step that variable is included in the regression
equation. The resulting value for R² and the change in R² for each step
are also shown.

Table 5

Step-wise Multiple Regression

                                    VALUES OF BETA
VARIABLES             STEP 1    STEP 2    STEP 3    STEP 4    STEP 5

ELLIPSIS              -.339     -.225      ?        -.213     -.232

FUNCTION OF TURN
  Request Info                   ?         ?         .017     -.004
  Elaboration                    .273      .283      .236      .223
  Info Response                 -.005      .046      .057      .048

SPEAKER PREV TURN
  Fieldworker                              *         .269      .248
  Peer                                     *         .251      .244

FUNCTION PREV TURN
  Request Info                                       .037      .021
  Prompt                                             .132      .127
  Elab-Info                                         -.0??     -.021

LANGUAGE                                                       .120

R²                     .115      .177      .218      .227      .240
R² Change              .115      .062      .041      .009      .013
F(R² Change)        166.684    32.481    33.261     4.848    23.036
df                   1,1284    3,1284    2,1284    3,1284    1,1284

? Value illegible or missing in the source copy.
* Only one coefficient (.285) is legible for this step; the source copy
  does not show to which of the two values it belongs.
At each step the change in R² is significant, p < .01. The
step-wise regression shows that much of the influence of Function of
Turn can also be accounted for by Ellipsis. Once the Speaker of the
Previous Turn is identified, less than 1% additional variance is
accounted for by also including the Function of the Previous Turn. This
small increment is in spite of the relatively large R² of Function of
the Previous Turn (R² = .127) when that variable is considered in
isolation. The contribution of Language is extremely small in either
case, but is slightly increased when the effects of the other discourse
variables are also considered. As could be expected, the effects of the
discourse variables are neither fully independent nor completely
redundant. Total R² is .240, suggesting that about one-fourth of the
total variance in utterance length can be accounted for by
differentiated discourse contexts.
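The test on each increment in R² can be sketched with the usual F ratio for a change in R². This is an illustration under stated assumptions, not the authors' computation: the function name is hypothetical, the denominator uses the R² of the current step, and the residual degrees of freedom are taken as the 1284 reported in Table 5. Because the tabled R² values are rounded, the result only approximates the tabled F.

```python
def f_for_r2_change(r2_before, r2_after, k_added, df_residual):
    """F ratio for the increment in R^2 when k_added predictors enter."""
    if not 0.0 <= r2_before <= r2_after < 1.0:
        raise ValueError("expected 0 <= r2_before <= r2_after < 1")
    change = r2_after - r2_before
    # Mean square for the increment over mean square unexplained at this step.
    return (change / k_added) / ((1.0 - r2_after) / df_residual)

# Step 2 of Table 5: R^2 rises from .115 to .177 as three dummy variables enter.
f_step2 = f_for_r2_change(0.115, 0.177, k_added=3, df_residual=1284)
print(round(f_step2, 2))
```

With these rounded inputs the ratio comes out near 32, in the neighborhood of the value reported for Step 2.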



Maximum likelihood estimates. An alternative means of considering
all of the discourse variables simultaneously is provided by maximum
likelihood estimation. For this approach, utterances are tallied, and
the data of interest are the frequencies with which utterances of each
length occur in each discourse context. The distribution of lengths
will here be regarded as a multinomial function. Estimation of such
multinomials by maximum likelihood is discussed by Edwards (1972).

Sociolinguists who have used quantified approaches to the study of
natural language variability have long been frustrated by the
distributional characteristics of most language variables, with many
contexts of great interest occurring naturally at very low frequencies
(cf. Labov, 1966). The use of maximum likelihood techniques for
estimating effects of linguistic environments was introduced by
Cedergren (1973) and Cedergren & Sankoff (1974). The several models
which these researchers estimated by maximum likelihood have more
recently been replaced by a model based on logistic transformation (Cox,
1970; Lindsey, 1975) of proportions (Rousseau & Sankoff, 1976). This
model has been extended to the description of polychotomous variables by
Jones (1975). The introduction of this latter model to the treatment of
language variation follows the development of the necessary computer
programming by P. Rousseau at the University of Montreal.³

Terminology in the related sociolinguistic literature diverges from
that of other statistical treatments. "Factor group" and "factor" are
used analogously to the analysis of variance terms "variable" and
"level," respectively. Within a factor group, factors give a mutually
exclusive and exhaustive characterization of all observations. Any
given observation is defined by one factor from each factor group and,
in this case, by a category corresponding to length.

To apply this procedure here, we model length of utterance as a
multinomial function, classifying utterances by length in words. In
order to reduce the number of parameters estimated in the multinomial
and to balance the data set, length was recoded into the following ten
categories:

Length Category   L1   L2   L3   L4   L5   L6   L7   L8    L9     L10

No. of words       1    2    3    4    5    6    7   8-9  10-11  12-31

This results in a compression of the upper end of the length scale, but
affects a relatively small proportion of the data set.
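The recoding into ten length categories can be sketched as a simple lookup. The function name `length_category` is hypothetical; the boundaries follow the recoding listed above, with the observed range of 1 to 31 words treated as a hard limit for illustration.

```python
def length_category(n_words):
    """Map a word count (1-31) to one of the ten length categories, 1-10."""
    if not 1 <= n_words <= 31:
        raise ValueError("word count outside the observed range 1-31")
    if n_words <= 7:
        return n_words          # categories 1-7 are single word counts
    if n_words <= 9:
        return 8                # 8-9 words
    if n_words <= 11:
        return 9                # 10-11 words
    return 10                   # 12-31 words

print([length_category(n) for n in (1, 7, 8, 9, 10, 12, 31)])  # [1, 7, 8, 8, 9, 10, 10]
```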

In a sense, treatment as a multinomial degrades the information in
the data set, since length is treated as a nominal rather than interval
scale. However, the treatment of length as an ordinal scale is itself
troublesome, since longer utterances do not result simply from
incrementing shorter utterances, but from a complex change in the
semantic and syntactic integration of information. Whatever limitations
there may be in the linguistic or cognitive interpretation of a
multinomial model of length, other models of this secondary language
measure seem at least equally opaque, and do not share many of the
useful properties of this approach.

³ We are grateful to Dr. Pascale Rousseau for providing us with a
copy of this program and related documentation.



Thus, the data are characterized as a frequency distribution. Each of
the cells into which observations fall is defined by one factor from
each factor group, and by the relevant length category. The cells form
a matrix of n + 1 dimensions, where there are n factor groups. It is
convenient, however, to represent the observations as a two dimensional
matrix, with the rows defined by the actually occurring combinations of
factors, and the columns defined as the length categories. The actual
number of rows in these data is considerably less in each of the
analyses given here than the possible number of rows from the cross
classification of all factors in all factor groups. A row in the matrix
contains all observations for an occurring combination of factors. A
column contains all observations of a particular length. The first few
rows of the frequency distribution for Analysis 1 below are given in
Table 6.
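The two dimensional frequency matrix can be sketched as a tally keyed by the occurring factor combinations. The function name and the sample utterances below are hypothetical; only combinations that actually occur produce a row, as in the text.

```python
from collections import defaultdict

def frequency_matrix(observations, n_categories=10):
    """Tally utterances into rows keyed by the occurring factor combination.

    observations: iterable of (factors, category) pairs, where factors is a
    tuple with one factor code per factor group and category is 1-based.
    """
    rows = defaultdict(lambda: [0] * n_categories)
    for factors, category in observations:
        rows[tuple(factors)][category - 1] += 1
    return dict(rows)

# Hypothetical utterances: (factor codes for five factor groups, length category)
sample = [
    ((1, 1, 1, 1, 1), 3),
    ((1, 1, 1, 1, 1), 4),
    ((1, 1, 1, 1, 1), 4),
    ((1, 1, 1, 3, 1), 1),
]
matrix = frequency_matrix(sample)
for factors, counts in matrix.items():
    print(factors, counts)
```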

Table 6

Sample Data Display: Number of Utterances by Factor
and by Length Category

      Factors                     Number of Observations
   Factor Groups                    Length Categories
 1   2   3   4   5       1    2    3    4    5    6    7    8    9   10

 1   1   1   1   1       0    0    1    6    0    0    0    0    0    0
 1   1   1   2   2       0    0    3    2    0    0    0    1    8    0
 1   1   1   3   1      62   32   1?    9    6    1    2    0    0    0
 1   1   1   3   2       0    1    5   10    5   1?    9    6    1    0

? Digit illegible in the source copy.

The object of the maximum likelihood procedures is to derive a
multinomial function for each factor in each factor group, and to
calculate the likelihood that these functions adequately characterize
the data set. Whether or not the use of additional factors
significantly improves the characterization of the data set may be
tested by comparing likelihoods across solutions. Also, from the
functions defining each factor it is possible to estimate the
probability of occurrence of utterances of each length category in each
discourse context.

From the frequency matrix, a set of coefficients is calculated for
each factor. Additionally, in order to force a unique solution, one
other factor is calculated which applies in common to all rows. This is
referred to in the sociolinguistic studies as the "input factor";
elsewhere as "overall effect," or a "constant." These coefficients are
calculated so that the probability for any cell is the sum of the
logistic transforms of the n factors defining its row, plus the logistic
transform of the overall effect. The numerous desirable properties of
the logistic transformation for use with data in the form of
proportions are discussed in Rousseau and Sankoff (1976), and in
Hofacker (1982).

The maximum likelihood procedure uses a nonlinear iterative
technique in which all coefficients are estimated simultaneously. The
procedure is iterated until it converges on a maximum value for the log
likelihood (Jones, 1975), or until it becomes apparent that convergence
is not possible. For the analyses below which failed to converge under
this algorithm, we report the probability coefficients from the
iteration with the highest log likelihood.
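The quantity being maximized can be sketched as the multinomial log likelihood of the frequency matrix under a set of predicted probabilities. This is a generic illustration, not the program used by the authors: the function name is hypothetical, and the constant multinomial coefficient, which does not depend on the parameters, is omitted.

```python
import math

def multinomial_log_likelihood(counts, probs):
    """Sum of n * log(p) over all cells; rows are contexts, columns lengths."""
    total = 0.0
    for count_row, prob_row in zip(counts, probs):
        for n, p in zip(count_row, prob_row):
            if n > 0:
                total += n * math.log(p)   # cells with n == 0 contribute nothing
    return total

# Hypothetical single context with two length categories.
counts = [[3, 1]]
fitted = multinomial_log_likelihood(counts, [[0.75, 0.25]])   # observed proportions
uniform = multinomial_log_likelihood(counts, [[0.5, 0.5]])    # no length effect
print(fitted > uniform)  # True
```

An iterative fitting procedure adjusts the coefficients so that this sum increases from one iteration to the next until no further gain is found.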

Maximum likelihood could be estimated using all the discourse
variables [text illegible in the source copy]. From among all of the
possibilities we have sampled two combinations of five factor groups,
two combinations of four factor groups, and finally, we have split the
data set by language, and run separate estimates for each language set
using three factor groups each. Each choice of factor groups is in
effect a hypothesis of the underlying dimensions of the data set. The
factor groups used to define the data set in each run, and the resulting
log likelihoods are shown in Table 7.

Table 7

Factor Groups Used for Each Maximum Likelihood Estimate

FACTOR GROUPS                               LOG LIKELIHOOD

ANALYSIS 1:                                    -2263.15
  Language
  Ellipsis
  Function of Turn
  Function of Previous Turn
  Number of Clauses

ANALYSIS 2:                                    -2643.56
  Language
  Ellipsis
  Function of Turn
  Function of Previous Turn
  Speaker of Previous Turn

ANALYSIS 3:                                    -2682.03
  Language
  Ellipsis
  Function of Turn
  Function of Previous Turn

ANALYSIS 4:                                    -2667.16
  Language
  Ellipsis
  Function of Turn
  Speaker of Previous Turn

ANALYSIS 5a: (English data only)               -1303.92
  Ellipsis
  Function of Turn
  Function of Previous Turn

ANALYSIS 5b: (Spanish data only)               -1336.88
  Ellipsis
  Function of Turn
  Function of Previous Turn

In spite of the difficulty of interpreting Number of Clauses as an
independent variable we retained it in Analysis 1, but not in the
subsequent treatments. Analyses 3 and 4 alternate between Function of
the Previous Turn and Speaker of the Previous Turn. In each of these
estimates, Language (Spanish and English) was also maintained as a
factor group, in spite of the failure to show a language effect in the
analyses of variance. For our ultimate purposes of modeling language
development the appropriateness of aggregating across languages is open
to serious question, and we prefer to demonstrate similarity (or
difference) across languages, rather than assume it. For Analyses 5a
and 5b, the data set was split by language, and separate estimates were
run for each set. Otherwise these are identical in factor groups to
Analysis 3. For each of the analyses, utterances are aggregated across
speakers.

Comparing Alternative Analyses. Maximum likelihood solutions are
generally evaluated in two different ways. One approach, which might be
termed absolute, compares how well probability or frequency
distributions estimated using the maximum likelihood solutions fit the
observed frequencies. Using cumulative frequency displays, Figures 1-3
show observed distributions and those predicted by the maximum
likelihood estimates for the first three rows of data given in Table 6.
Conventional measures of goodness of fit, particularly χ², are difficult
to interpret for these data, where there are many cells with very low
expected frequencies, and no observations at all in many of the possible
cross-classifications of factors.

An alternative to such absolute tests of goodness of fit are
relative tests, comparing alternative solutions. Solutions are
preferred which both maximize the log likelihood, and minimize the
number of parameters estimated from the data set. This is tested by
computing twice the difference in log likelihoods for any two solutions,
and comparing that figure with the χ² distribution, with degrees of
freedom equal to the change in degrees of freedom between the two
solutions.
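The relative test can be sketched in its conventional likelihood-ratio form. The function name and the log likelihoods below are hypothetical, chosen only to show the arithmetic; the resulting statistic would be referred to a χ² table at the appropriate degrees of freedom.

```python
def likelihood_ratio_statistic(loglik_restricted, loglik_full):
    """Conventional statistic: twice the gain in log likelihood."""
    return 2.0 * (loglik_full - loglik_restricted)

# Hypothetical log likelihoods for a smaller and a larger set of factor groups.
stat = likelihood_ratio_statistic(-2700.0, -2690.0)
print(stat)  # 20.0
```

A solution with more factor groups always fits at least as well, so the statistic is non-negative; the question is whether the gain is large relative to the number of added parameters.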




Figure 1. Cumulative frequency distribution (%), observed and
predicted. Legend partly illegible in the source copy; legible factors:
ellipsis, request information, request information.

Figure 2. Cumulative frequency distribution (%), observed and
predicted. Legend partly illegible in the source copy; legible factors:
(English,) ellipsis, request information, elaboration, including
Analysis 5a.

By this criterion, first choice among the solutions is Analysis 1,
incorporating Number of Clauses as a factor group. Its log likelihood
is significantly greater than that of any other solution. Analysis 2
is significantly better than Analysis 3 (χ² (2,9) = 79.9?), but not
better than Analysis 4 (χ² (3,9) = 23.60). Splitting the data set by
languages in Analyses 5a and 5b nearly doubles the number of parameters
estimated, but does not significantly improve the log likelihood (χ²
(6,9) = 41.23) over Analysis 3.

With the multinomial approach used here it is possible to examine
the effect of each factor across the length categories. The coefficients
estimated for each factor in each analysis are given in Table 8. They
are also shown graphically for Analyses 1, 3, 5a, and 5b, in Figures
4-7, respectively. Coefficients of 0.1 (the reciprocal of the number of
categories in the multinomial) have no effect; larger coefficients lead
to estimates of higher probabilities, smaller coefficients to lower
probabilities.
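One way to read such coefficients is sketched below. The combination rule is an assumption made for illustration and is not confirmed by the source: log coefficients are summed across the factors defining a context, together with the overall effect, and the result is renormalized. Under that assumption a coefficient vector equal to the reciprocal of the number of categories leaves the distribution unchanged. All names and values are hypothetical, and coefficients are assumed to be strictly positive.

```python
import math

def predicted_distribution(coefficient_sets):
    """Combine per-factor coefficient vectors into one probability vector.

    Assumed form: log coefficients are summed across factors (including the
    overall effect) and the result is renormalized to sum to 1.
    """
    k = len(coefficient_sets[0])
    log_sums = [sum(math.log(c[j]) for c in coefficient_sets) for j in range(k)]
    peak = max(log_sums)                      # subtract the peak for stability
    unnorm = [math.exp(v - peak) for v in log_sums]
    total = sum(unnorm)
    return [u / total for u in unnorm]

overall = [0.4, 0.3, 0.2, 0.1]          # hypothetical overall effect, 4 categories
neutral = [0.25, 0.25, 0.25, 0.25]      # reciprocal of the number of categories
boost_long = [0.1, 0.2, 0.3, 0.4]       # hypothetical factor favoring long utterances

print(predicted_distribution([overall, neutral]))
print(predicted_distribution([overall, boost_long]))
```

In this sketch the neutral factor reproduces the overall effect, while the second factor shifts probability toward the longer categories.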

The three different treatments of the data set shown graphically
yield generally different patterns of coefficients for the factors, and
some patterns that seem quite consistent across treatments. In Analysis
1, Overall Effect shows three separate highs, which seem to result from
classifying the utterances by number of clauses. In Analysis 3, Overall
Effect is a generally decreasing function of length, roughly comparable
to the overall frequency distribution of length. In Analyses 5a and 5b,
where the data set is split by language, the general shape of Analysis 2
is maintained, but with opposite effects at Lengths 3 and 4.

Language Factor. The overall shape of the effects for the Language
factors is the same in Analyses 1 and 3, but of considerably greater
magnitude in Analysis 1. In both cases, Lengths 1-3 are more probable
in English than in Spanish, with the opposite effect at the high end of
the length scale. Although language seems to show a fairly small
effect, particularly when number of clauses is not considered (Analysis
3), the separate analyses in 5a and 5b show quite different effects for
some factors.



Table 8: Coefficients for Factors



Eiliptil 

Ftinctibfi of previous turn 
rimctlbfi of (urn 
ifntM 

iMAnh 3 to 



1 

1 




h net 


2 




n~ I7fl 


1 
1 






2 , 






I 


■ 


' O.OjS 


2 


■ 


Q.P55 




■ 








0.109 




* 


0.079 


2 




0.063 


j 




0.128 


k 


m 


\ 0JJ9 


j 


m 


\ o;o86 


2 


u 


0.004 


) 


u 


- - 0.01 1 



0.0(1 

0;|ii7 

Q.IO^ 

mi 

0.08^ 
0.122 

b.jiz 
b.ido 

0.OS9' 
0.122 

o.ios 
6:686 
O.IH 
0.000 



0.06S 

Q.ue 

0.096 

0.099 
0.127 
0.07^ 

o.d7t 
0.128 
0.118 
0.097 
D.08S 

o;o90 
0;0I6 
0.02^ 
0.010 



0.108 

o.oj} 

0.092 

b.lo6 

0.159 
0.079 
0.071 

o;o9i 

O.IOl 
0.099 
0;057 
0.157 
Q.193 
0.676 
0.000 



0.083 - 0)006 0.152 0.005 0 



0.097 
0.092 
0.096 
0.102 
0.097 
0:06$: 
0.098 
q;I07 
0.167 
0.08^ 
0.083 
0.075 
0.005 

0.030 
' 0.026 

0 



0.127 
0.071 
o.iu 

0.087 

0:091 
o:IOk 
0.120 
0.072 

0.113 
o.uz 

0.088 
0.076 
0.003 
0.025 
0.052 
122 0. 



0.109 
0.082 
0:130 
0.075 
0.072 
0.083 
Q.IO8 
0.13b 

b.b7j 
0.125 
0.089 
«.n3 

0.002 

0.017 

0.106 



0.130 
0.069 
0:070 
0.139 
Q.082 
•0.112 
O.ll^ 

b.bsi 

0.085 

b.i2l 
b.io2^ 



-o.bbi 
0.107 
0.115 



139 0.088 < p 



5.1*7 
0.061 
o:oB7 
0.112 
0.076 
b.2*b 
0.059 
0:078 
b.b7b 
o:ji6 
b.153 
0.071 
0.001 
0:009 
0.351 
.033 0. 



0:102 

Q.OBS 
Q.0S9 
0.109 

o.in 
0.09^ 
b.088 
b.b93 
b.d96 
b.ibS 
b;o9* 
,0.091 

10.003 

o;oos 
0.259 
221 



0.063 0.075 

0.139 b.i27 

b.i89 0.H5 

■ 0.0(6 0.060 

0.107 e:o8t 

0.052 0.b72 

■ 0:159 0:129 
0.097 'OHIO 

■ 0.088 D.l)8 

■ ' 0:056 0.056 

. 0.131 o.m 

0.119 0.096 

0.173 b.isi 1 b. 



0.072 

0.132 

O.IOi 
. O.OBt 

0:123 

0.076 
0.170 
0.131 
0.131 



o.ia 

0.080 

b.bss 
0.158 
0.085 
o.llS 
0.109 



0.12* 
0.077 
0.065 
0.13* 



0.083 

0.093' 
15* 0 



0.113 0.097 0.125 0.10* 

0.085 0.099 0.076 0.092 

0.091 0.085 0.095 0.1 II 

0.096 0.103. 0.092 0.079 

0.1*6. 0.092 0.097 0.073 

0.086 0.090 O.lll 0.086 

0.073 0.09* O.IIO 0.103 

0.09* 0.109- 0.072 0.133 

O.ll* 0.182 0.119 0.071 

0.076 0.066 0.095 0.109

Figure 4.1. Overall effect (Analysis 1).
Figure 4.2. Language (Analysis 1).
Figure 5.1. Overall effect (Analysis 3).
Figure 5.2. Language (Analysis 3).
Figure 5.3. Ellipsis (Analysis 3).
Figure 6.1. Overall effect (Analysis 5a, English).
Figure 6.2. Ellipsis (Analysis 5a, English).
Figure 6.3. Function of previous turn (Analysis 5a, English).
Figure 7.2. Ellipsis (Analysis 5b, Spanish).
Figure 7.3. Function of previous turn (Analysis 5b, Spanish).

Ellipsis factors. The ellipsis factors show the expected effect
that short utterances are more likely to be elliptic than non-elliptic,
and that the opposite relationship holds at the high end of the length
scale. There is essentially no effect in the mid range. The ellipsis
effect is considerably mitigated by maintaining Number of Clauses as a
factor group. This is likely due to the fact that elliptic utterances
are almost always less than a full clause, and what shows as an ellipsis
effect in Analyses 3 and 4 is in part a clause effect in Analysis 1.
Splitting the data set by language shows the ellipsis effect to be
greater in English than in Spanish.

Clause factors. The general effect of the Clause factors in
Analysis 1 is somewhat of a mirror image of the Overall Effect. Partial
Clause shows a positive effect at Length 4; 1 clause has positive
effects at Lengths 2 and 4; and more than one clause is weighted to the
Lengths 8 and over.

Discourse function. The effects related to function are rather
more complex. The patterns across Analyses 1 and 3 are generally
comparable in both shape and magnitude. Considering the two languages
separately, however, shows quite different effects across the two
languages. Considering the Function of Previous Turn, Requests for
Information and Prompts tend to be weighted against very short
utterances, with Request for Information showing a positive effect in
the middle lengths, and Prompts a positive effect for longer utterances.
When the previous turn was not directly an invitation for the speaker to
give additional information, that is, when the previous turn itself
consisted chiefly of new information, or was directed to the nature of
the interaction process, shorter utterances by the target speaker are
weighted more heavily.

Somewhat opposite effects are shown for the function of the
measured turn itself. Elaboration is generally weighted toward the
greater lengths, and Information Response to the middle range. The
pattern shown here for the Spanish only data in Analysis 5b is suspect,
due to a very small number of observations in some categories at higher
lengths.

Separating out the Effects of Discourse Context

The findings presented here on the effects of discourse context on
length of utterance strongly suggest that any attempt to relate
utterance length to linguistic development of children of this age or
older must either control for discourse context, or develop a system for
accommodating the influences of discourse contexts which are external to
the speaker. The analyses presented above provide two different bases
by which the influences of discourse context can be partialled out.
Using multiple regression, it is possible to calculate expected lengths
and show how individual children differ from the aggregate. Using the
maximum likelihood estimates, it is possible to calculate expected
frequency distributions across lengths, and again show how individual
children differ from those expected distributions.
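The regression-based comparison can be sketched as a mean difference between observed and predicted lengths for each child. The sketch assumes that a predicted length is already available for every utterance from the pooled regression; the function name, child labels, and values are hypothetical.

```python
from collections import defaultdict

def mean_residuals(utterances):
    """utterances: iterable of (child_id, observed_length, predicted_length)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for child, observed, predicted in utterances:
        sums[child] += observed - predicted    # residual for one utterance
        counts[child] += 1
    return {child: sums[child] / counts[child] for child in sums}

data = [("A", 5, 4.0), ("A", 3, 4.0), ("B", 7, 4.5), ("B", 6, 4.5)]
scores = mean_residuals(data)
print(scores)  # {'A': 0.0, 'B': 2.0}
```

A positive score means the child's utterances run longer than the pooled discourse contexts alone would predict; a negative score means they run shorter.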

We will consider first the multiple regression. Using the B
coefficients from the final step of the regression we calculate the
predicted length of utterance for each discourse context, defined across
the five discourse variables. The differences between those predicted
lengths and the actual observed lengths for each utterance for each
child are the residuals. These residuals are conventionally used to
calculate the error of the regression. Here we interpret the residual
as the child's contribution to utterance length, once the effects of the
discourse contexts, pooled across all children, are subtracted. For
each child we calculate the mean residual, and use that value as a score
for the child. These scores are shown under the column labelled
"Residuals" in Table 10.

From the maximum likelihood calculations we take a rather different
approach to separating out the effects of discourse contexts. The
maximum likelihood estimates provide probabilities that utterances of
any particular length will be observed in each discourse context.
Actually, we are not interested in the probability of occurrence of an
utterance of a particular length, but in the probability that an
utterance of at least that particular length will occur in a given
discourse context. In this sense, length is now interpreted as an
ordinal scale; lengths are ordered, but no assumptions are made related
to interval between lengths. These probabilities may be readily
estimated from the cumulative frequency distribution of utterances, with
frequencies ranked by length of utterance. The cumulative distribution
is shown for this data set in Figure 8. A set of weights may be derived
from these probabilities by the simple calculation w_i = 1 - p_i, where
p_i represents the probability of an utterance of at least length i, and
w_i is the assigned weight. In an information theory sense, this may be
interpreted as utterances of high probability yielding little
information about the language development of a child, while utterances
of low probability yield more information, and are weighted
proportionately greater.
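The weight calculation w_i = 1 - p_i can be sketched directly from a frequency distribution of lengths. The function name and the counts below are hypothetical; p_i is estimated as the proportion of utterances at least as long as length i, so the weight is the proportion of strictly shorter utterances.

```python
def length_weights(counts):
    """counts[i] is the number of utterances of length i + 1.

    Returns w_i = 1 - p_i, where p_i is the proportion of utterances that
    are at least as long as length i.
    """
    total = sum(counts)
    weights = []
    shorter = 0                      # utterances strictly shorter than length i
    for n in counts:
        weights.append(shorter / total)   # equals 1 - P(at least length i)
        shorter += n
    return weights

print(length_weights([4, 3, 2, 1]))  # [0.0, 0.4, 0.7, 0.9]
```

The shortest length always receives a weight of zero, and the weights rise quickly through the frequent short lengths before levelling off.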



These probabilities form a scale rather different from a scale
based on word count. The two scales are compared in Table 9. The
percentage of change in the assigned value for each additional word in
the utterance is, for very short utterances, much greater for the
weighted index than for the integral count. The percent of change is
identical at six words and then tapers off very quickly for the weighted
index. This has the effect of stretching the scale up to about the mean
length of utterance for the children in this sample, and then
compressing the upper reaches of the scale. Intuitively, this seems a
desirable counter to the apparent bias introduced into small samples
(e.g., the ten utterances of the BINL sampling procedure) by an
occasional very long but atypical sentence.

This weighting procedure also offers a very straightforward way of
incorporating the effects of discourse context. Weights may be
calculated independently from the frequency distributions observed in
each discourse context. Cumulative frequency distributions for elliptic
and nonelliptic utterances in English and Spanish are graphed in Figure
9. From this graph it may be seen that, in both English and Spanish,
elliptic utterances occur at higher relative frequencies in the shorter
length categories than do nonelliptic utterances. Under the weighting
schema suggested above, higher weights would be assigned to elliptic
utterances than to nonelliptic utterances of the same length, with the
differential decreasing as utterances get longer.

Figure 8. Cumulative frequency distribution.

Table 9

              %         Weight        %
Length    Increment    (1 - p_i)   Increment      N

   1                      .00                    219
   2         1.0          .17                    153
   3         .50          .29         .71        176
   4         .33          .42         .45        165
   5         .25          .55         .31        139

   6         .20          .66         .20        115
   7         .16          .75         .15         85
   8         .15          .81         .08         55
   9         .13          .85         .05         50
  10          ?           .89         .05         26

  11         .10          .91         .02         35
  12         .08          .93         .02         21
  13         .08          .95         .02         25
  14         .07          .97         .02          8
  15         .07          .97         .01         12

  16         .07          .98         .01          6
  17         .06          .99         .00          5
  18         .06          .99         .00          2
  19         .06          .99         .00          3
  20         .05          .99         .00          1

  21         .05          .99         .00          1
  22         .05          1.0         .00          1
  23         .05          1.0         .00          1
  24         .05          1.0         .00          1
  25         .05          1.0         .00          0

  26         .05          1.0         .00          1
  27         .05          1.0         .00          0
  28         .05          1.0         .00          0
  29         .05          1.0         .00          0
  30         .03          1.0         .00          0
  31         .03          1.0         .00          ?

? Value illegible or missing in the source copy.



In principle, it would be possible to continue to subdivide the
data set by the various discourse variables that have been considered
here, and derive separate weights for each context. There are two
problems with this, one practical, the other theoretical. The practical
problem is again the distribution of observations across particular
contexts, i.e., cells defined by each combination of independent
variables, in the data set. The number of observations in many of the
discourse contexts in this sample is so small that a great deal of
sampling variation can be expected. The theoretical problem is that the
procedure treats all contexts as independent and makes no assumption
that there are common factors operating in varying combinations across
the contexts. The coefficients which derive from the maximum likelihood
estimation provide a solution to both of these problems. The model
assumes that a limited number of factors are independent, and that these
factors, rather than the many contexts which they jointly define,
account for the observed differences in frequency distributions across
discourse contexts. Under this assumption of independence, it is
possible to use related contexts for the estimation of effects in
contexts where there are very few observations.



To incorporate the discourse effects estimated by the maximum
likelihood procedure, we use the same weighting procedure discussed
above: one minus the estimated probability of an utterance at least as
long as that observed in any particular discourse context.
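A discourse-sensitive weight of this kind can be sketched as follows, assuming a predicted probability vector over the length categories is available for each context. The function names, context labels, and distributions are hypothetical and do not come from the estimates reported here.

```python
def context_weight(category, predicted_probs):
    """1 - P(category or longer) under one context's predicted distribution."""
    return sum(predicted_probs[: category - 1])

def mean_weight(utterances, distributions):
    """utterances: list of (context, category); distributions: context -> probs."""
    weights = [context_weight(cat, distributions[ctx]) for ctx, cat in utterances]
    return sum(weights) / len(weights)

# Hypothetical predicted distributions over four length categories.
distributions = {
    "elliptic": [0.5, 0.3, 0.15, 0.05],
    "nonelliptic": [0.1, 0.2, 0.3, 0.4],
}
child = [("elliptic", 3), ("nonelliptic", 3), ("nonelliptic", 1)]

print(context_weight(3, distributions["elliptic"]))     # higher weight
print(context_weight(3, distributions["nonelliptic"]))  # lower weight, same length
print(mean_weight(child, distributions))
```

An utterance of the same length earns a higher weight in a context where long utterances are improbable, which is the sense in which the index is sensitive to discourse context.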
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Table 10 

Individual Scores on Language Measures



EWLISH 



SntNISH 



Cbl M : 
I0» 


itiiri Profie. 


[Table 10 entries (sum of judges' ranks, MLU, residual, and mean maximum
likelihood weights for each child in English and in Spanish, with column
means and standard deviations) are not legible in the source.]

These weights are, thus, equivalent to the values of the predicted
cumulative frequency distributions in Figures 1-3, where the weight for
length i is equal to the cumulative probability for length i-1, for all
i = 2, 3, . . . k. In all contexts, the weight for Length 1 is 0.0.
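
The weighting rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the context labels, and the probability values are hypothetical, and the predicted distributions would in practice come from the maximum likelihood estimates rather than being typed in by hand.

```python
from itertools import accumulate


def length_weights(probabilities):
    """Return the weight for each utterance length 1..k in one context.

    probabilities[i - 1] is the predicted probability of length i.
    The weight for length i is the cumulative probability of length i - 1,
    so the weight for length 1 is always 0.0.
    """
    cumulative = list(accumulate(probabilities))
    return [0.0] + cumulative[:-1]


def mean_weight(lengths, contexts, distributions):
    """Average the weights of one child's utterances.

    lengths and contexts are parallel lists (one entry per utterance);
    distributions maps a context label to its predicted probabilities.
    """
    weights = [
        length_weights(distributions[context])[length - 1]
        for length, context in zip(lengths, contexts)
    ]
    return sum(weights) / len(weights)


# Hypothetical predicted distributions for two discourse contexts.
distributions = {
    "response": [0.5, 0.25, 0.125, 0.125],
    "initiation": [0.125, 0.25, 0.25, 0.375],
}

response_weights = length_weights(distributions["response"])
initiation_weights = length_weights(distributions["initiation"])

# Three hypothetical utterances from one child.
child_mean = mean_weight(
    [1, 3, 4], ["response", "response", "initiation"], distributions
)
```

In this made-up example a three-unit utterance earns a larger weight as a response, where short utterances predominate, than it would as an initiation, which is the sense in which the index is sensitive to discourse context.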

Comparison of Weightings

From each of the maximum likelihood estimates, weights based on the
estimated cumulative probability were derived for each sentence. Mean
weights were then calculated for each child in each language under each
of the three analyses (Table 10). These mean weights provide a basis
for at least an initial test of whether the general notion of weights
based on frequency of occurrence and discourse context is of utility in
estimating language development, and if so, which of the alternative
approaches is preferable. The problem, of course, is the identification
of an independent standard of language development, such that the
proposed measure can be evaluated.

No satisfactory standard is available for this data set, and so two
general indicators will be considered, both separately and jointly.
These are age and judges' holistic ratings of language proficiency. A
third indicator, [...] complexity scores, was not considered. The
correlation of these complexity scores with MLU approaches unity
(English, r = .998; Spanish, r = .975) and they cannot be treated as an
independent indicator.

Age as an independent measure of language development. Age is an
extremely precarious indicator of language development for a bilingual
population. The children in this sample have highly diverse experiences
in each of their languages. Some of the children are from homes where
Spanish was the primary, perhaps only, language until they entered
school. Others are from homes where English is the primary language for
at least some dyads. Some of the children are in full bilingual
programs in school; others are in regular programs, with or without
extra instruction in English. Nonetheless, we expect that there will be
a generally positive relationship between age and language development
in both languages.

Holistic ratings of language proficiency. The holistic ratings of
language proficiency were based on approximately ten pages of a
transcript from each administration for each sibling pair, as
previously discussed in the method section. Five adult judges who are
fluently bilingual in English and Spanish, and who had not had direct
contact with the children in the study, were asked to rate each child's
language (about 45 turns on the average) on a scale of one to ten, with
the following instruction:

We would like to get an idea of how well the children in our study
speak English and Spanish. On the basis of your impressions from
looking at the transcript provided, please rank the two children
(whose initials or names appear above) on a scale of 1-10, with 10
being excellent, 5 being average, and 1 being poor. Use your own
criteria as the basis for these evaluations/rankings.

To verify that judges were responding in reasonably comparable ways
to the task, we converted the ratings to ranks, and calculated Kendall's
Coefficient of Concordance (Siegel, 1956) for both the English and
Spanish samples. There was significant concurrence across judges for
each sample (English: Kendall's W = 0.531, χ²(12) = 31.87, p < .01;
Spanish: Kendall's W = 0.488, χ²(15) = 36.66, p < .01). The sum of
ranks across the five judges for each child in each language is given in
Table 10.
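
The concordance check can be illustrated with a short Python sketch. It assumes the standard textbook form of Kendall's W, without any correction for tied ranks, and the usual chi-square approximation m(n - 1)W on n - 1 degrees of freedom for m judges and n children; the function names and the ratings below are hypothetical, and nothing here is taken from the authors' own computations.

```python
def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and scores[order[end + 1]] == scores[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def kendalls_w(ratings):
    """ratings[j][i] is judge j's rating of child i.

    Returns (W, chi_square, degrees_of_freedom), with no tie correction.
    """
    m = len(ratings)       # number of judges
    n = len(ratings[0])    # number of children
    ranked = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[i] for row in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    return w, m * (n - 1) * w, n - 1


# Hypothetical ratings: three judges, four children, scale of 1 to 10.
ratings = [
    [2, 5, 7, 9],
    [3, 4, 8, 10],
    [1, 6, 6, 9],
]
w, chi_square, df = kendalls_w(ratings)
```

With five judges and 12 degrees of freedom, as in the English sample, the approximation reduces to chi-square = 60 W.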

Correlations for age, mean judges' proficiency rating, MLU, and
mean weights calculated under each of the three maximum likelihood
procedures are given in Tables 11a and 11b for English and Spanish,
respectively.

Table 11a
Correlation Coefficients for Age and Eight Indicators of Language Proficiency

ENGLISH

                                        Max Likelihood Estimates
            Judges    MLU   Residual     1      2      3      4      5a

Age          .602    .653    .649      .495    ...    ...    ...    .661
Judges               .451    .402      .537   .556    ...    ...    .579
MLU                          .893      .853   .909   .936   .920   .936
Residual                               .629   .937    ...   .935   .872
M.L. 1                                        .707   .808   .728   .807
M.L. 2                                               .962   .998   .964
M.L. 3                                                      .966   .998
M.L. 4                                                             .966

Table 11b
SPANISH

                                        Max Likelihood Estimates
            Judges    MLU   Residual     1      2      3      4      5b

Age          .627    .195     ...      .579    ...   .406   .405   .358
Judges               .384    .319       ...   .464   .504   .448   .483
MLU                          .934       ...   .886   .934   .882   .940
Residual                               .687   .943   .905   .942   .913
M.L. 1                                        .852   .859   .854   .859
M.L. 2                                               .965   .999    ...
M.L. 3                                                      .961   .998
M.L. 4                                                             .965

Note. Entries shown as ... are not legible in the source. In the
original, entries below the diagonal give significance levels
(n.s.; * p < .05; ** p < .01); these are likewise not legible.

Patterns of relationships among the measures of language development.
The straight measure of length, MLU, correlates significantly
with age only in English. MLU does not correlate significantly with the
judges' rankings in either language. Thus MLU by itself does not seem
to show a particularly consistent relation to development. For both
languages, the judges' rankings correlate with age at about r = .6.
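
As a rough illustration of how coefficients of this kind, and a conventional test of their significance, might be computed, the following Python sketch calculates a Pearson product-moment correlation and the associated t statistic on n - 2 degrees of freedom. The use of this particular test is an assumption, since the text does not say how significance was assessed, and the ages and rank sums below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical ages (in months) and judges' rank sums for six children.
ages = [60, 72, 84, 90, 96, 108]
rank_sums = [12, 20, 18, 25, 22, 30]
r_age_judges = pearson_r(ages, rank_sums)
t_age_judges = t_statistic(r_age_judges, len(ages))
```

The resulting t value would then be compared with a tabled critical value for the chosen significance level.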

The length measures which [...] produce results which do not differ
greatly from MLU; all but one correlate with MLU in the r = .85-.95
range. The one exception is Analysis 1 in Spanish, r = .687. The
residuals from the regression show relationships to age and to the
judges' rankings comparable to, or somewhat lower than, those for MLU.
Again the correlation is significant only with age, and only in English.

The patterns for the maximum likelihood weights are somewhat mixed.
All except Analysis 1 correlate significantly with age in English;
exactly the opposite is true in Spanish. There only Analysis 1
correlates significantly with age. Analysis 3 correlates with the
judges' rankings in both languages. Analysis 5a also correlates
significantly with both age and judges' rankings in English, but not in
Spanish.

As might be anticipated from comparison of the log likelihoods in
Table 7, Analysis 2 and Analysis 4 produce essentially identical
results. Also, Analysis 3 is extremely similar to Analyses 5a and 5b.

Recognizing that there are, in this sample of children, numerous
situations which intervene in the expected relationship between age and
language development, we compared the various indicators derived from
length with age and the judges' proficiency ratings, considered jointly.
Separate multiple regressions were computed for MLU, for Residuals, and
for each of the weightings from the Maximum Likelihood estimations in
English and in Spanish, with Age and Judges' Ranks as independent
variables. These are summarized in Table 12.




Table 12

Summary of Multiple Regressions of Length Measures on Age
and Judges' Proficiency Rankings

[Columns: Dependent Measure, Multiple R, R², Standard Error, reported
separately for ENGLISH and SPANISH. The entries are not legible in the
source.]

Considering Age and the judges' rankings jointly had essentially no
effect whatsoever for MLU, or for the residuals from the regression. It
did, however, generally increase the relationship shown to the various
weightings, but only very slightly.
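
The quantities summarized in Table 12 (multiple R, R², and standard error) can be obtained from an ordinary least-squares regression of a length measure on two predictors. The Python sketch below is a minimal illustration that solves the normal equations directly; the function name and the data are hypothetical, and the standard error is taken to be the standard error of estimate with n - 3 degrees of freedom, which is an assumption about what the table reports.

```python
import math


def regress_on_two(y, x1, x2):
    """Least-squares regression of y on x1 and x2 (with an intercept).

    Returns (multiple_r, r_squared, standard_error_of_estimate).
    """
    n = len(y)
    mean_y, mean_1, mean_2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    dy = [v - mean_y for v in y]
    d1 = [v - mean_1 for v in x1]
    d2 = [v - mean_2 for v in x2]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    syy = sum(a * a for a in dy)
    det = s11 * s22 - s12 ** 2  # zero if the predictors are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    explained = b1 * s1y + b2 * s2y
    # Clamp to [0, 1] to guard against floating-point rounding.
    r_squared = min(max(explained / syy, 0.0), 1.0)
    residual_ss = max(syy - explained, 0.0)
    standard_error = math.sqrt(residual_ss / (n - 3))
    return math.sqrt(r_squared), r_squared, standard_error


# Hypothetical data: a mean weight, age in months, and judges' rank sum
# for each of six children.
mean_weights = [0.30, 0.42, 0.38, 0.47, 0.45, 0.55]
ages = [60, 72, 84, 90, 96, 108]
rank_sums = [12, 20, 18, 25, 22, 30]
multiple_r, r_squared, std_error = regress_on_two(mean_weights, ages, rank_sums)
```

Because the multiple R can never be lower than the larger of the two simple correlations, a joint analysis that raises it only slightly indicates that the second predictor adds little beyond the first.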

Conclusions

Our general contention that discourse function or context
influences utterance length seems well confirmed in these language
samples. Any attempt to draw inferences related to language development
from examination of measures of length must accommodate these effects,
at least for children as old as those observed here. The effect of
discourse on length can be identified variously in terms of function of
turn or function of previous turn, or syntactically in terms of ellipsis
or number of clauses. When number of clauses is considered, the effect
of ellipsis is largely obviated, but the effects of discourse function
are largely unchanged. Noting the speaker of the previous turn does not
seem to be preferable to noting the function of the previous turn.
Nothing is gained by considering both speaker and function jointly.

The differences across languages seem fairly small. There was no
language effect in the analyses of variance. Under maximum likelihood,
language factors were also fairly small. When the data set was
partitioned by language, the factors for function show rather different
patterns. These differences, however, had essentially no effect on the
derived weightings. Analysis 3 and Analyses 5a and 5b produced almost
identical results.

From the maximum likelihood calculations, it is possible to derive
predicted probability distributions for length. Weights derived from
these distributions can, in turn, replace length as a language
development indicator. The resulting weights correlate more highly with
judges' holistic ratings of proficiency than does MLU, in both English
and Spanish. In Spanish, the correlation of the weights to Age is
substantially higher than the correlation of MLU; in English, they are
roughly comparable, with consideration of number of clauses producing a
somewhat lower correlation. A similar pattern results when Age and
Judges' Rankings are considered jointly in multiple regression.

All of this is based on a rather sparse data set from relatively
few children. Nonetheless, the concept of weights derived from
frequency distributions of length defined across discourse contexts
appears extremely promising. The longitudinal study from which these
data derive allows the possibility of enlarging the language samples
from each child, and of looking at children across time. The ability
to measure change within children across time will provide a truer test
of the utility of the procedures. Whether or not it will be possible to
demonstrate all five of the desirable measurement characteristics
suggested above remains to be determined.


